Formulation of Deep Reinforcement Learning Architecture Toward Autonomous Driving for On-Ramp Merge
Multiple automakers have automated driving systems (ADS) in development or in
production that offer freeway-pilot functions. This type of ADS is typically
limited to restricted-access freeways only, that is, the transition from manual
to automated modes takes place only after the ramp merging process is completed
manually. One major challenge to extend the automation to ramp merging is that
the automated vehicle needs to incorporate and optimize long-term objectives
(e.g. successful and smooth merge) when near-term actions must be safely
executed. Moreover, the merging process involves interactions with other
vehicles whose behaviors are sometimes hard to predict but may influence the
merging vehicle's optimal actions. To tackle such a complicated control problem,
we propose to apply Deep Reinforcement Learning (DRL) techniques for finding an
optimal driving policy by maximizing the long-term reward in an interactive
environment. Specifically, we apply a Long Short-Term Memory (LSTM)
architecture to model the interactive environment, from which an internal state
containing historical driving information is conveyed to a Deep Q-Network
(DQN). The DQN is used to approximate the Q-function, which takes the internal
state as input and generates Q-values as output for action selection. With this
DRL architecture, the historical impact of interactive environment on the
long-term reward can be captured and taken into account for deciding the
optimal control policy. The proposed architecture has the potential to be
extended and applied to other autonomous driving scenarios such as driving
through a complex intersection or changing lanes under varying traffic flow
conditions.
Comment: IEEE International Conference on Intelligent Transportation Systems, Yokohama, Japan, 201
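The LSTM-plus-DQN pipeline described above can be sketched in miniature: a recurrent cell compresses the observation history into an internal state, and a Q head maps that state to one Q-value per discrete action. Everything below is illustrative, a single-unit cell with hand-picked weights and a made-up observation stream, not the paper's trained network.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyLSTMCell:
    """Single-unit LSTM cell that summarizes the driving history
    into an internal state h (illustrative weights, not learned)."""
    def __init__(self, w_i=0.5, w_f=0.5, w_o=0.5, w_c=0.5):
        self.w_i, self.w_f, self.w_o, self.w_c = w_i, w_f, w_o, w_c
        self.h, self.c = 0.0, 0.0

    def step(self, x):
        z = x + self.h                      # combined input and recurrent signal
        i = sigmoid(self.w_i * z)           # input gate
        f = sigmoid(self.w_f * z)           # forget gate
        o = sigmoid(self.w_o * z)           # output gate
        g = math.tanh(self.w_c * z)         # candidate cell state
        self.c = f * self.c + i * g
        self.h = o * math.tanh(self.c)
        return self.h

def q_values(h, head):
    """Linear Q head: maps the LSTM internal state to one Q-value
    per discrete action (e.g. accelerate, hold, brake)."""
    return [w * h + b for (w, b) in head]

# Hypothetical observation stream (e.g. normalized gap to the lag vehicle).
cell = TinyLSTMCell()
for obs in [0.2, 0.5, 0.9, 1.3]:
    h = cell.step(obs)

head = [(1.0, 0.0), (-0.5, 0.1), (0.2, -0.2)]  # illustrative head weights
qs = q_values(h, head)
action = max(range(len(qs)), key=lambda a: qs[a])
```

The internal state carries the historical impact of the interaction into the Q-value estimate, which is the point of placing the LSTM in front of the DQN.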
Autonomous Ramp Merge Maneuver Based on Reinforcement Learning with Continuous Action Space
Ramp merging is a critical maneuver for road safety and traffic efficiency.
Most of the current automated driving systems developed by multiple automobile
manufacturers and suppliers are typically limited to restricted access freeways
only. Extending the automated mode to ramp merging zones presents substantial
challenges. One is that the automated vehicle needs to incorporate a future
objective (e.g. a successful and smooth merge) and optimize a long-term reward
that is impacted by subsequent actions when executing the current action.
Furthermore, the merging process involves interaction between the merging
vehicle and its surrounding vehicles whose behavior may be cooperative or
adversarial, leading to distinct merging countermeasures that are crucial to
successfully complete the merge. In place of conventional rule-based
approaches, we propose to apply a reinforcement learning algorithm to the
automated vehicle agent to find an optimal driving policy by maximizing the
long-term reward in an interactive driving environment. Most importantly, in
contrast to most reinforcement learning applications in which the action space
is resolved as discrete, our approach treats the action space as well as the
state space as continuous without incurring additional computational costs. Our
unique contribution is the design of the Q-function approximation whose format
is structured as a quadratic function, by which simple but effective neural
networks are used to estimate its coefficients. The results obtained through
the implementation of our training platform demonstrate that the vehicle agent
is able to learn a safe, smooth and timely merging policy, indicating the
effectiveness and practicality of our approach.
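The quadratic Q-function design is what makes continuous actions tractable without extra computation: because Q is concave in the action, the greedy continuous action is available in closed form. A minimal sketch, with hypothetical scalar coefficients standing in for the simple neural networks that estimate them in the paper:

```python
def quadratic_q(a, k, mu, v):
    """Quadratic Q-function over a continuous action a:
    Q(s, a) = -k(s) * (a - mu(s))**2 + v(s), with k(s) > 0.
    In the paper, simple neural networks estimate the coefficients;
    here k, mu, v are illustrative scalars for one fixed state."""
    return -k * (a - mu) ** 2 + v

# Because Q is concave in a, the greedy continuous action is available
# in closed form: argmax_a Q(s, a) = mu(s). No action-space
# discretization and no separate actor network is needed.
k, mu, v = 2.0, 1.5, 0.8                 # hypothetical coefficient outputs
best_action = mu                         # closed-form maximizer
state_value = quadratic_q(best_action, k, mu, v)   # equals v(s)
```

Any other action scores strictly lower, so the max over the continuous action space needed by the Q-learning target is free.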
A Reinforcement Learning Approach for Intelligent Traffic Signal Control at Urban Intersections
Ineffective and inflexible traffic signal control at urban intersections can
often lead to bottlenecks in traffic flows and cause congestion, delay, and
environmental problems. How to manage traffic smartly by intelligent signal
control is a significant challenge in urban traffic management. With recent
advances in machine learning, especially reinforcement learning (RL), traffic
signal control using advanced machine learning techniques represents a
promising solution to tackle this problem. In this paper, we propose an RL
approach for traffic signal control at urban intersections. Specifically, we
use a neural network as the Q-function approximator (a.k.a. Q-network) to deal with
the complex traffic signal control problem where the state space is large and
the action space can be discrete. The state space is defined based on real-time
traffic information, i.e. vehicle position, direction and speed. The action
space includes various traffic signal phases which are critical in generating a
reasonable and realistic control mechanism, given the prominent
spatial-temporal characteristics of urban traffic. In the simulation
experiment, we use SUMO, an open source traffic simulator, to construct
realistic urban intersection settings. Moreover, we use different traffic
patterns, such as major/minor road traffic, through/left-turn lane traffic,
tidal traffic, and varying demand traffic, to train a generalized traffic
signal control model that can be adapted to various traffic conditions. The
simulation results demonstrate the convergence and generalization performance
of our RL approach as well as its significant benefits in terms of queue length
and wait time over several benchmarking methods in traffic signal control.
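The core learning step behind such a signal controller can be sketched as one Q-learning update over a state of queue lengths and a discrete phase action. A linear model stands in for the paper's Q-network, and the two-phase action space, state encoding, and reward are simplified assumptions for illustration:

```python
PHASES = ["NS_green", "EW_green"]   # simplified two-phase action space

def q_value(weights, state, phase):
    """Linear Q-function approximator; in the paper a neural network
    plays this role, but a linear model keeps the sketch small."""
    return sum(wi * si for wi, si in zip(weights[phase], state))

def td_update(weights, state, phase, reward, next_state,
              alpha=0.1, gamma=0.9):
    """One Q-learning step: w <- w + alpha * td_error * grad_w Q(s, a)."""
    target = reward + gamma * max(q_value(weights, next_state, p)
                                  for p in PHASES)
    td_error = target - q_value(weights, state, phase)
    weights[phase] = [wi + alpha * td_error * si
                      for wi, si in zip(weights[phase], state)]
    return td_error

# State: normalized queue lengths on the NS and EW approaches.
weights = {p: [0.0, 0.0] for p in PHASES}
state, next_state = [0.8, 0.2], [0.5, 0.2]
reward = -sum(next_state)            # e.g. negative total queue length
err = td_update(weights, state, "NS_green", reward, next_state)
```

Penalizing queue length in the reward is one plausible reading of the paper's queue-length and wait-time objectives, not its exact reward design.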
Behavior Planning of Autonomous Cars with Social Perception
Autonomous cars have to navigate in a dynamic environment that can be full of
uncertainties. The uncertainties can come either from sensor limitations such
as occlusions and limited sensor range, or from probabilistic prediction of
other road participants, or from unknown social behavior in a new area. To
safely and efficiently drive in the presence of these uncertainties, the
decision-making and planning modules of autonomous cars should intelligently
utilize all available information and appropriately tackle the uncertainties so
that proper driving strategies can be generated. In this paper, we propose a
social perception scheme which treats all road participants as distributed
sensors in a sensor network. By observing the individual behaviors as well as
the group behaviors, all three types of uncertainty can be updated uniformly
in a belief space. The updated beliefs from the social perception are then
explicitly incorporated into a probabilistic planning framework based on Model
Predictive Control (MPC). The cost function of the MPC is learned via inverse
reinforcement learning (IRL). Such an integrated probabilistic planning module
with socially enhanced perception enables the autonomous vehicles to generate
behaviors which are defensive but not overly conservative, and socially
compatible. The effectiveness of the proposed framework is verified in
simulation on a representative scenario with sensor occlusions.
Comment: To appear at the 2019 IEEE Intelligent Vehicles Symposium (IV2019)
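Treating other road participants as distributed sensors amounts to Bayesian belief updating: an observed behavior is a measurement whose likelihood depends on what is hidden. The sketch below shows one such update for an occlusion hypothesis; the likelihood numbers are hypothetical, and the paper's belief space and MPC planner with an IRL-learned cost are far richer than this single scalar belief:

```python
def update_belief(prior, likelihood_if_hazard, likelihood_if_clear):
    """One Bayes update of the belief that a hazard is hidden in an
    occluded region, using another road user's observed behavior as
    a distributed-sensor measurement.
    posterior = P(behavior|hazard)P(hazard) / P(behavior)."""
    joint_hazard = prior * likelihood_if_hazard
    joint_clear = (1.0 - prior) * likelihood_if_clear
    return joint_hazard / (joint_hazard + joint_clear)

belief = 0.1   # prior probability of an occluded hazard
# Another car brakes hard near the occlusion: braking is assumed much
# more likely if a hazard is present (0.9) than if the road is clear (0.2).
belief = update_belief(belief, 0.9, 0.2)
```

The raised belief then inflates the planner's cost for entering the occluded region, which is how the framework stays defensive without being uniformly conservative.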
Driving Decision and Control for Autonomous Lane Change based on Deep Reinforcement Learning
We apply a Deep Q-network (DQN) with consideration of safety during the
task for deciding whether to conduct the maneuver. Furthermore, we design two
similar deep Q-learning frameworks with a quadratic approximator for deciding
how to select a comfortable gap and how to follow the preceding vehicle. Finally, a
polynomial lane change trajectory is generated and Pure Pursuit Control is
implemented for path tracking. We demonstrate the effectiveness of this
framework in simulation, from both the decision-making and control layers. The
proposed architecture also has the potential to be extended to other autonomous
driving scenarios.
Comment: This paper has been submitted to ITSC 201
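The Pure Pursuit tracking layer mentioned above follows a standard geometric law: steer so the rear axle traces a circular arc through a lookahead point on the planned trajectory. The formula is the textbook one; the wheelbase and lookahead values below are illustrative, not the paper's parameters:

```python
import math

def pure_pursuit_steering(alpha, lookahead, wheelbase):
    """Pure Pursuit steering law: alpha is the angle between the
    vehicle heading and the lookahead point on the lane-change path;
    returns the front-wheel steering angle
    delta = atan(2 * L * sin(alpha) / l_d)."""
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)

wheelbase = 2.7    # m, typical passenger car (assumed)
lookahead = 6.0    # m, lookahead distance on the polynomial path (assumed)

straight = pure_pursuit_steering(0.0, lookahead, wheelbase)   # on course
left = pure_pursuit_steering(0.1, lookahead, wheelbase)       # point to the left
```

A lookahead point dead ahead yields zero steering, and a point off to the left yields a positive (left) steering angle, which is the sanity check one expects from the geometry.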
Quadratic Q-network for Learning Continuous Control for Autonomous Vehicles
Reinforcement Learning algorithms have recently been proposed to learn
time-sequential control policies in the field of autonomous driving. Direct
applications of Reinforcement Learning algorithms with discrete action space
will yield unsatisfactory results at the operational level of driving where
continuous control actions are actually required. In addition, the design of
neural networks often fails to incorporate domain knowledge of the target
problem, such as classical control theory in our case. In this
paper, we propose a hybrid model by combining Q-learning and classic PID
(Proportional-Integral-Derivative) controller for handling continuous
vehicle control problems in a dynamic driving environment. Particularly,
instead of using a big neural network as Q-function approximation, we design a
Quadratic Q-function over actions with multiple simple neural networks for
finding optimal values within a continuous space. We also build an action
network based on the domain knowledge of the control mechanism of a PID
controller to guide the agent to explore optimal actions more efficiently. We
test our proposed approach in simulation under two common but challenging
driving situations, the lane change scenario and ramp merge scenario. Results
show that the autonomous vehicle agent can successfully learn a smooth and
efficient driving behavior in both situations.
Comment: Machine Learning for Autonomous Driving Workshop at NeurIPS, 201
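The PID domain knowledge fed into the action network is the classic three-term control law. A minimal sketch; the gains, time step, and the lateral-offset use case are illustrative assumptions, and in the paper this mechanism shapes an action network rather than acting as a standalone controller:

```python
class PID:
    """Classic PID controller: u = Kp*e + Ki*integral(e) + Kd*de/dt.
    Gains and dt below are illustrative, not tuned values."""
    def __init__(self, kp, ki, kd, dt=0.1):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def step(self, error):
        self.integral += error * self.dt
        deriv = (0.0 if self.prev_error is None
                 else (error - self.prev_error) / self.dt)
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * deriv)

# Hypothetical use: drive the lateral offset to the target lane centre to zero.
pid = PID(kp=0.8, ki=0.05, kd=0.2)
command = pid.step(error=1.0)   # first step: P and I terms only
```

Seeding exploration with such structured actions is what lets the agent avoid the long random-exploration phase of a purely blank-slate Q-learner.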
Automated Driving Maneuvers under Interactive Environment based on Deep Reinforcement Learning
Safe and efficient autonomous driving maneuvers in an interactive and complex
environment can be considerably challenging due to the unpredictable actions of
other surrounding agents that may be cooperative or adversarial in their
interactions with the ego vehicle. One of the state-of-the-art approaches is to
apply Reinforcement Learning (RL) to learn a time-sequential driving policy, to
execute proper control strategy or tracking trajectory in dynamic situations.
However, direct application of RL algorithms is not sufficient to
deal with cases in the autonomous driving domain, mainly due to the complex
driving environment and continuous action space. In this paper, we adopt
Q-learning as our basic learning framework and design a unique format of the
Q-function approximator that consists of neural networks to handle the
continuous action space challenge. The learning model is presented in a closed
form of continuous control variables and trained in a simulation platform that
we have developed with embedded properties of real-time vehicle interactions.
The proposed algorithm avoids invoking an additional actor network that learns
to take actions, as in actor-critic algorithms. At the same time, some prior
knowledge of vehicle dynamics is also fed into the model to assist learning. We
test our algorithm with a challenging use case - lane change maneuver, to
verify the practicability and feasibility of the proposed approach. Results
from accumulated rewards and vehicle performance show that the RL vehicle agent
successfully learns a safe, comfortable, and efficient driving policy as defined
in the reward function.
Continuous Control for Automated Lane Change Behavior Based on Deep Deterministic Policy Gradient Algorithm
Lane change is a challenging task which requires delicate actions to ensure
safety and comfort. Some recent studies have attempted to solve the lane-change
control problem with Reinforcement Learning (RL), yet the action is confined to
discrete action space. To overcome this limitation, we formulate the lane
change behavior with continuous action in a model-free dynamic driving
environment based on Deep Deterministic Policy Gradient (DDPG). The reward
function, which is critical for learning the optimal policy, is defined by
control values, position deviation status, and maneuvering time to provide the
RL agent with informative signals. The RL agent is trained from scratch without
resorting to any prior knowledge of the environment and vehicle dynamics since
they are not easy to obtain. Seven models under different hyperparameter
settings are compared. A video showing the learning progress of the driving
behavior is available. It shows that the RL vehicle agent initially runs off
the road boundary frequently, but eventually manages to change smoothly and
stably to the target lane with a success rate of 100% under diverse driving
situations in simulation.
Comment: Published at the 30th IEEE Intelligent Vehicles Symposium (IV), 201
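One ingredient that makes DDPG's continuous-action training stable is the soft (Polyak) update of the target networks. The sketch below shows just that rule, with parameters as flat float lists; real implementations apply it tensor by tensor, and tau and the values here are illustrative:

```python
def soft_update(target, source, tau=0.005):
    """DDPG-style Polyak averaging of target-network parameters:
    theta_target <- tau * theta_source + (1 - tau) * theta_target.
    Small tau makes the target network trail the learned network
    slowly, stabilizing the bootstrapped critic targets."""
    return [tau * s + (1.0 - tau) * t for t, s in zip(target, source)]

# Hypothetical critic parameters after some training steps.
critic = [1.0, -1.0]
critic_target = [0.0, 0.0]
critic_target = soft_update(critic_target, critic, tau=0.1)
```

With tau=0.1 each target parameter moves 10% of the way toward the learned one per update, so the target network changes slowly even while the critic itself changes quickly.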
Meta-Adversarial Inverse Reinforcement Learning for Decision-making Tasks
Learning from demonstrations has made great progress over the past few years.
However, it is generally data hungry and task specific. In other words, it
requires a large amount of data to train a decent model on a particular task,
and the model often fails to generalize to new tasks that have a different
distribution. In practice, demonstrations from new tasks will be continuously
observed and the data might be unlabeled or only partially labeled. Therefore,
it is desirable for the trained model to adapt to new tasks that have limited
data samples available. In this work, we build an adaptable imitation learning
model based on the integration of Meta-learning and Adversarial Inverse
Reinforcement Learning (Meta-AIRL). We exploit the adversarial learning and
inverse reinforcement learning mechanisms to learn policies and reward
functions simultaneously from available training tasks and then adapt them to
new tasks with the meta-learning framework. Simulation results show that the
adapted policy trained with Meta-AIRL can effectively learn from a limited
number of demonstrations, and quickly reach performance comparable to that of
the experts on unseen tasks.
Comment: 2021 International Conference on Robotics and Automation (ICRA 2021)
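The meta-learning side of such a framework can be illustrated with a first-order (Reptile-style) outer update: nudge a shared initialization toward parameters adapted on each training task, so that a few inner steps suffice on a new task. This is a generic stand-in for the meta-learning component, not the paper's exact Meta-AIRL algorithm, and the adversarial IRL inner loop is omitted entirely:

```python
def meta_update(meta_params, adapted_params, meta_lr=0.1):
    """First-order meta-learning outer step (Reptile-style):
    theta <- theta + meta_lr * (theta_task - theta).
    The adapted parameters would come from a few inner-loop
    updates on one training task (here: made-up numbers)."""
    return [m + meta_lr * (a - m)
            for m, a in zip(meta_params, adapted_params)]

theta = [0.0, 0.0]           # shared policy/reward initialization
task_adapted = [1.0, 2.0]    # hypothetical result of inner-loop adaptation
theta = meta_update(theta, task_adapted)
```

Repeating this over many training tasks leaves theta near a point from which each task is reachable in a handful of gradient steps, which is what "quickly reach expert-comparable performance on unseen tasks" relies on.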
A Data Driven Method of Optimizing Feedforward Compensator for Autonomous Vehicle
A reliable controller is critical and essential for the execution of safe and
smooth maneuvers of an autonomous vehicle. The controller must be robust to
external disturbances, such as road surface, weather, and wind conditions. It
also needs to deal with the internal parametric variations of vehicle
sub-systems, including power-train efficiency, measurement errors, time
delay, and so on. Moreover, as in most production vehicles, the low-level
control commands for the engine, brake, and steering systems are delivered
through separate electronic control units. These aforementioned factors
introduce opacity and ineffectiveness into controller performance. In this
paper, we design a feed-forward compensation process via a data-driven method
to model and further optimize the controller performance. We apply principal
component analysis to extract the most influential features. Subsequently, we
adopt a time delay neural network to predict the tracking error over a future
time horizon. Utilizing the predicted error, we then design a feed-forward
compensation process to improve the control performance. Finally, we
demonstrate the effectiveness of the proposed feed-forward compensation
process in simulation scenarios.
Comment: This paper has been submitted to the 2019 IEEE Intelligent Vehicles Symposium
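The time-delay-plus-compensation idea can be sketched in two steps: build the tapped-delay-line input a time delay network consumes, then correct the baseline command with the predicted error. A fixed linear predictor stands in for the trained network, the PCA feature-extraction step is omitted, and all weights and error values below are hypothetical:

```python
def delay_window(history, depth):
    """Tapped-delay-line input of a time delay neural network:
    the most recent `depth` samples of the tracking error."""
    return history[-depth:]

def compensate(baseline_cmd, predicted_error, gain=1.0):
    """Feed-forward compensation: correct the baseline control
    command with the predicted future tracking error before the
    feedback loop ever sees that error."""
    return baseline_cmd + gain * predicted_error

# Hypothetical recent speed-tracking errors (growing, so the
# predictor should anticipate further undershoot).
errors = [0.00, 0.02, 0.05, 0.09, 0.14]
window = delay_window(errors, depth=3)

# Illustrative fixed linear predictor over the delay window; in the
# paper a trained time delay neural network produces this prediction.
predicted = sum(w * e for w, e in zip([0.2, 0.3, 0.5], window))
cmd = compensate(baseline_cmd=1.0, predicted_error=predicted)
```

Because the correction is applied feed-forward, it acts before the error accumulates, rather than waiting for feedback to react to it.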